
    Assessing Centrality Without Knowing Connections

    We consider the privacy-preserving computation of node influence in distributed social networks, as measured by egocentric betweenness centrality (EBC). Motivated by modern communication networks spanning multiple providers, we show for the first time how multiple mutually-distrusting parties can compute node EBC while revealing only differentially-private information about their internal network connections. A theoretical utility analysis upper bounds, with high probability, a primary source of private EBC error: the private release of ego networks. Empirical results demonstrate practical applicability, with a low relative error of 1.07 achievable at the strong privacy budget ε = 0.1 on a Facebook graph, and insignificant performance degradation as the number of network provider parties grows. (Comment: full report of the paper appearing in PAKDD 2020.)
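
    As a rough illustration of the quantity involved, the sketch below computes EBC for a single node with networkx and releases it through a standard Laplace mechanism. This is not the paper's multi-party protocol; the epsilon and sensitivity parameters are placeholder assumptions.

```python
# Minimal sketch: egocentric betweenness centrality (EBC) for one node,
# released via the standard Laplace mechanism. Not the paper's
# multi-party protocol; `epsilon` and `sensitivity` are placeholders.
import math
import random
import networkx as nx

def egocentric_betweenness(G: nx.Graph, v) -> float:
    """Betweenness of v restricted to its ego network
    (v, its neighbours, and the edges among them)."""
    ego = nx.ego_graph(G, v)
    return nx.betweenness_centrality(ego, normalized=False)[v]

def laplace(scale: float) -> float:
    """Draw Laplace(0, scale) noise via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_ebc(G: nx.Graph, v, epsilon: float, sensitivity: float) -> float:
    # Standard Laplace mechanism: noise scale = sensitivity / epsilon.
    return egocentric_betweenness(G, v) + laplace(sensitivity / epsilon)
```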

    Reversible Data Perturbation Techniques for Multi-level Privacy-preserving Data Publication

    The amount of digital data generated in the Big Data age is increasing rapidly. Privacy-preserving data publishing techniques based on differential privacy through data perturbation provide a safe release of datasets, such that sensitive information present in the dataset cannot be inferred from the published data. Existing privacy-preserving data publishing solutions have focused on publishing a single snapshot of the data, under the assumption that all users of the data share the same level of privilege and access the data at a fixed privacy level. Such schemes therefore do not directly support data release when data users have different levels of access to the published data. While the straightforward approach of releasing a separate snapshot of the data for each possible access level can allow multi-level access, it results in a higher storage cost, requiring separate storage space for each instance of the published data. In this paper, we develop a set of reversible data perturbation techniques for large bipartite association graphs that use perturbation keys to control the sequential generation of multiple snapshots of the data, offering multi-level access based on privacy levels. The proposed schemes enable multi-level data privacy, allowing selective de-perturbation of the published data when suitable access credentials are provided. We evaluate the techniques through extensive experiments on a large real-world association graph dataset; the experiments show that the proposed techniques are efficient, scalable, and effectively support multi-level data privacy on the published data.
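
    The following toy sketch illustrates the general idea of key-controlled reversible perturbation using deterministic additive noise derived from a per-level key. It is a deliberate simplification, not the paper's construction for bipartite association graphs; the key and record names are made up.

```python
# Toy sketch of key-controlled reversible perturbation. Noise is derived
# from an HMAC of the record id under a per-level key, so holders of the
# key can regenerate and subtract it exactly. Keys/ids are hypothetical.
import hmac
import hashlib

def _keyed_noise(key: bytes, record_id: str, magnitude: int) -> int:
    digest = hmac.new(key, record_id.encode(), hashlib.sha256).digest()
    # Map the first 8 bytes to an integer in [-magnitude, magnitude].
    raw = int.from_bytes(digest[:8], "big")
    return raw % (2 * magnitude + 1) - magnitude

def perturb(value: int, key: bytes, record_id: str, magnitude: int = 50) -> int:
    return value + _keyed_noise(key, record_id, magnitude)

def deperturb(value: int, key: bytes, record_id: str, magnitude: int = 50) -> int:
    # Reversal is exact because the same key regenerates the same noise.
    return value - _keyed_noise(key, record_id, magnitude)

# Multi-level access: apply level keys in sequence; each credential
# removes one layer of perturbation.
level_keys = [b"level-1-key", b"level-2-key"]
v = 123
published = v
for k in level_keys:
    published = perturb(published, k, "edge-42")
restored = published
for k in reversed(level_keys):           # a user holding both keys
    restored = deperturb(restored, k, "edge-42")
assert restored == v
```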

    Order-Revealing Encryption and the Hardness of Private Learning

    An order-revealing encryption scheme gives a public procedure by which two ciphertexts can be compared to reveal the ordering of their underlying plaintexts. We show how to use order-revealing encryption to separate computationally efficient PAC learning from efficient (ε, δ)-differentially private PAC learning. That is, we construct a concept class that is efficiently PAC learnable, but for which every efficient learner fails to be differentially private. This answers a question of Kasiviswanathan et al. (FOCS '08, SIAM J. Comput. '11). To prove our result, we give a generic transformation from an order-revealing encryption scheme into one with strongly correct comparison, which enables the consistent comparison of ciphertexts that are not obtained as the valid encryption of any message. We believe this construction may be of independent interest. (Comment: 28 pages.)
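
    The sketch below fixes the interface the first sentence describes: secret-key encryption plus a public comparison on ciphertexts. The instantiation (a keyed monotone affine map) is deliberately trivial and insecure; it only shows the API shape, not a real ORE construction.

```python
# Toy sketch of the order-revealing encryption (ORE) interface: secret
# keygen/encrypt plus a *public* comparison on ciphertexts. This keyed
# affine map is NOT secure; it only fixes the API shape.
def keygen() -> tuple[int, int]:
    # Hypothetical key: slope/offset of a strictly increasing map.
    return (3, 7)

def encrypt(key: tuple[int, int], m: int) -> int:
    a, b = key
    return a * m + b  # monotone, so ciphertext order matches plaintext order

def compare(ct1: int, ct2: int) -> int:
    """Public comparison, no key needed: returns -1, 0, or 1."""
    return (ct1 > ct2) - (ct1 < ct2)

key = keygen()
assert compare(encrypt(key, 3), encrypt(key, 10)) == -1
```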

    Differentially Private Neighborhood-based Recommender Systems

    Privacy issues of recommender systems have become a hot topic for society, as such systems are appearing in every corner of our lives. While many secure multi-party computation protocols have been proposed to prevent information leakage during recommendation computation, very little has been done to restrict the information leakage from the recommendation results themselves. In this paper, we apply the differential privacy concept to neighborhood-based recommendation methods (NBMs) under a probabilistic framework. We first present a solution that directly calibrates Laplace noise into the training process to find the maximum a posteriori similarity parameters in a differentially private manner. We then connect differential privacy to NBMs by exploiting a recent observation that sampling from the scaled posterior distribution of a Bayesian model results in a provably differentially private system. Our experiments show that both solutions allow promising accuracy with a modest privacy budget, and that the second solution yields better accuracy if the sampling asymptotically converges. We also compare our solutions to recent differentially private matrix factorization (MF) recommender systems, and show that ours achieve better accuracy when the privacy budget is reasonably small. This is an interesting result, because MF systems often offer better accuracy when differential privacy is not applied.
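
    As a hedged sketch of the first solution's flavour, the code below perturbs learned item-item cosine similarities with Laplace noise before neighbourhood-based prediction. The sensitivity constant is a placeholder, not the paper's calibration.

```python
# Sketch: Laplace-perturbed item-item similarities for a
# neighbourhood-based recommender. Sensitivity handling is a
# placeholder assumption, not the paper's calibration.
import numpy as np

rng = np.random.default_rng(0)

def private_similarities(R: np.ndarray, epsilon: float,
                         sensitivity: float = 1.0) -> np.ndarray:
    """R: user-by-item rating matrix (zeros for missing ratings)."""
    norms = np.linalg.norm(R, axis=0, keepdims=True)
    norms[norms == 0] = 1.0
    S = (R / norms).T @ (R / norms)        # cosine similarity, items x items
    S_priv = S + rng.laplace(scale=sensitivity / epsilon, size=S.shape)
    np.fill_diagonal(S_priv, 0.0)          # an item is not its own neighbour
    return S_priv

def predict(R: np.ndarray, S: np.ndarray, user: int, item: int,
            k: int = 5) -> float:
    rated = np.nonzero(R[user])[0]
    neighbours = rated[np.argsort(-S[item, rated])][:k]
    w = S[item, neighbours]
    return float(w @ R[user, neighbours] / (np.abs(w).sum() + 1e-9))
```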

    Separating Computational and Statistical Differential Privacy in the Client-Server Model

    Differential privacy is a mathematical definition of privacy for statistical data analysis. It guarantees that any (possibly adversarial) data analyst is unable to learn too much information that is specific to an individual. Mironov et al. (CRYPTO 2009) proposed several computational relaxations of differential privacy (CDP), which relax this guarantee to hold only against computationally bounded adversaries. Their work and subsequent work showed that CDP can yield substantial accuracy improvements in various multiparty privacy problems. However, these works left open whether such improvements are possible in the traditional client-server model of data analysis. In fact, Groce, Katz and Yerukhimovich (TCC 2011) showed that, in this setting, it is impossible to take advantage of CDP for many natural statistical tasks. Our main result shows that, assuming the existence of sub-exponentially secure one-way functions and 2-message witness indistinguishable proofs (zaps) for NP, there is in fact a computational task in the client-server model that can be efficiently performed with CDP but is infeasible to perform with information-theoretic differential privacy.

    Identification of ORC1/CDC6-Interacting Factors in Trypanosoma brucei Reveals Critical Features of Origin Recognition Complex Architecture

    DNA replication initiates with the formation of a pre-replication complex on sequences termed origins. In eukaryotes, the pre-replication complex is composed of the Origin Recognition Complex (ORC), Cdc6, and the MCM replicative helicase in conjunction with Cdt1. Eukaryotic ORC is considered to be composed of six subunits, named Orc1–6, and monomeric Cdc6 is closely related in sequence to Orc1. However, ORC has been little explored in protists, and only a single ORC protein, related to both Orc1 and Cdc6, has been shown to act in DNA replication in Trypanosoma brucei. Here we identify three highly diverged putative T. brucei ORC components that interact with ORC1/CDC6 and contribute to cell division. Two of these factors are so diverged that we cannot determine whether they are orthologues of eukaryotic ORC subunits or parasite-specific replication factors. The third we show to be a highly diverged Orc4 orthologue, demonstrating that this is one of the most widely conserved ORC subunits in protists and revealing it to be a key element of eukaryotic ORC architecture. Additionally, we have examined interactions amongst the T. brucei MCM subunits and show that the complex has the conventional eukaryotic heterohexameric structure, suggesting that divergence in the T. brucei replication machinery is limited to the earliest steps in origin licensing.

    Local differential privacy: tools, challenges, and opportunities

    Web Information Systems Engineering (WISE 2019) Workshop, Demo, and Tutorial, Hong Kong and Macau, China, January 19–22, 2020.

    Publishing graph node strength histogram with edge differential privacy

    Protecting private graph data while releasing accurate estimates of the data is one of the most challenging problems in data privacy. Node strength combines the topological information of a weighted graph with its weight distribution in a natural way. Since an edge in graph data oftentimes represents a relationship between two nodes, edge differential privacy (edge-DP) can protect the relationship between two entities from being disclosed. In this paper, we investigate the problem of publishing the node strength histogram of a private graph under edge-DP. We propose two clustering approaches, one sequence-aware and one based on local density, to aggregate the histogram. Our experimental study demonstrates that these approaches can greatly reduce the error of approximating the true node strength histogram.
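
    A minimal sketch of the basic pipeline is given below: compute node strengths, histogram them, add Laplace noise for edge-DP, and optionally merge adjacent bins to trade resolution for noise. The sensitivity constant is a placeholder, and the paper's sequence-aware and local-density clustering schemes are not reproduced.

```python
# Sketch: node-strength histogram released with Laplace noise under
# edge-DP, plus a naive merge of adjacent bins. Sensitivity is a
# placeholder; the paper's clustering approaches are not reproduced.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

def private_strength_histogram(G: nx.Graph, bins: np.ndarray,
                               epsilon: float, sensitivity: float = 2.0):
    # Node strength = sum of incident edge weights.
    strengths = [sum(d.get("weight", 1.0) for _, _, d in G.edges(v, data=True))
                 for v in G.nodes]
    hist, edges = np.histogram(strengths, bins=bins)
    noisy = hist + rng.laplace(scale=sensitivity / epsilon, size=hist.shape)
    return np.maximum(noisy, 0.0), edges   # clamp negative counts

def merge_bins(hist: np.ndarray, group: int = 2) -> np.ndarray:
    # Aggregating adjacent bins averages out noise at coarser resolution.
    n = len(hist) // group * group
    return hist[:n].reshape(-1, group).sum(axis=1)
```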

    A Bayesian Nonparametric Approach to Differentially Private Data

    The protection of private and sensitive data is a problem of increasing importance, owing to the vast amount of personal data collected. Differential privacy is arguably the most dominant approach to privacy protection, and is currently deployed in both industry and government. In a decentralized paradigm, the sensitive information belonging to each individual is locally transformed by a known privacy-maintaining mechanism Q. The objective of differential privacy is to allow an analyst to recover the distribution of the raw data, or some functionals of it, while having access only to the transformed data. In this work, we propose a Bayesian nonparametric methodology to perform inference on the distribution of the sensitive data, reformulating the differentially private estimation problem as a latent-variable Dirichlet Process mixture model. This methodology has the advantage that it can be applied to any mechanism Q and works as a “black box” procedure, estimating the distribution and functionals thereof from the same MCMC draws and with very little tuning. Moreover, being fully nonparametric, it requires very few assumptions on the distribution of the raw data. For the most popular mechanisms Q, such as Laplace and Gaussian, we describe efficient specialized MCMC algorithms and provide theoretical guarantees. Experiments on both synthetic and real datasets show good performance of the proposed method.
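
    The decentralized setting is easy to make concrete. The sketch below applies a local Laplace mechanism Q, one of the mechanisms named in the abstract, independently to each individual's value; the inference side (the latent-variable Dirichlet Process mixture fitted by MCMC) is not reproduced.

```python
# Sketch of the decentralized setting: each individual's value passes
# through a known local mechanism Q, here Laplace noise. Only the data
# the analyst would see is generated; the DP mixture inference is not.
import numpy as np

rng = np.random.default_rng(0)

def q_laplace(x: np.ndarray, epsilon: float,
              sensitivity: float = 1.0) -> np.ndarray:
    """Local Laplace mechanism applied independently per individual."""
    return x + rng.laplace(scale=sensitivity / epsilon, size=x.shape)

raw = rng.normal(loc=2.0, scale=0.5, size=1000)   # hypothetical sensitive data
released = q_laplace(raw, epsilon=1.0)
# The analyst sees only `released` and must recover the law of `raw`.
```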

    PRIPEL: Privacy-preserving event log publishing including contextual information

    Event logs capture the execution of business processes in terms of executed activities and their execution context. Since logs contain potentially sensitive information about the individuals involved in the process, they should be pre-processed before being published to preserve the individuals’ privacy. However, existing techniques for such pre-processing are limited to a process’s control-flow and neglect contextual information, such as attribute values and durations, which precludes any form of process analysis that involves contextual factors. To bridge this gap, we introduce PRIPEL, a framework for privacy-aware event log publishing. Compared to existing work, PRIPEL takes a fundamentally different angle and ensures privacy at the level of individual cases instead of the complete log. This way, contextual information as well as long-tail process behaviour is preserved, which enables the application of a rich set of process analysis techniques. We demonstrate the feasibility of our framework in a case study with a real-world event log.
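
    To make the case-level angle concrete, the sketch below perturbs contextual values (event timestamps) independently within each case before publishing. It illustrates per-case treatment only, not PRIPEL's actual mechanism; the field names and noise model are assumptions.

```python
# Sketch: per-case perturbation of contextual event-log values. This is
# an illustration of case-level treatment, not PRIPEL's mechanism;
# field names ('activity', 'timestamp') and noise model are assumed.
import random
from datetime import datetime, timedelta

def anonymise_case(events: list[dict], max_shift_s: float = 300.0) -> list[dict]:
    """events: one case's events, each {'activity', 'timestamp'}."""
    out = []
    for e in events:
        shift = timedelta(seconds=random.uniform(-max_shift_s, max_shift_s))
        out.append({"activity": e["activity"],
                    "timestamp": e["timestamp"] + shift})
    out.sort(key=lambda e: e["timestamp"])   # keep a valid ordering per case
    return out

log = [[{"activity": "register", "timestamp": datetime(2020, 1, 1, 9, 0)},
        {"activity": "approve",  "timestamp": datetime(2020, 1, 1, 9, 20)}]]
published = [anonymise_case(case) for case in log]
```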